A SHUFFLE THAT MIXES SETS OF ANY FIXED SIZE MUCH FASTER 
THAN IT MIXES THE WHOLE DECK
ABSTRACT:

Consider an $n$ by $n$ array of cards shuffled in the following manner. An element $x$ of the array is chosen uniformly at random; then with probability 1/2 the rectangle of cards above and to the left of $x$ is rotated 180 degrees, and with probability 1/2 the rectangle of cards below and to the right of $x$ is rotated 180 degrees. It is shown by an eigenvalue method that the time required to approach the uniform distribution is between $n^2/2$ and $cn^2 \ln n$ for some constant $c$. On the other hand, for any $k$ it is shown that the time needed to uniformly distribute a set of cards of size $k$ is at most $c(k)n$, where $c(k)$ is a constant times $k^3 \ln(k)^2$. This is established via coupling; no attempt is made to get a good constant.

1 Introduction



Consider playing cards, numbered $1, \ldots, n^2$, in an $n \times n$ array; the set of positions in this array is denoted by

$$[n \times n] = \{(i,j) : 1 \le i,j \le n\},$$

with $(1,1)$ in the upper-left corner. For $1 \le i,j \le n$, let $\pi_{ij}$ be the permutation that sends the card in the $(r,s)$ position to the $(i+1-r, j+1-s)$ position if $r \le i$ and $s \le j$, and does not change the position of the card otherwise. In other words, the rectangle of size $i \times j$ in the upper-left corner gets rotated by 180° and the remaining cards are unmoved. (The $(1,1)$ position is in the upper left, following matrix rather than Cartesian notation.) Let $\pi'_{ij}$ denote the shuffle that does the same for the lower-right corner, so that the card in the $(r,s)$ position is moved to position $(n+i-r, n+j-s)$ if $r \ge i$ and $s \ge j$ and is otherwise unmoved. Questions about how rapidly this type of permutation mixes an array were inspired by a Macintosh screensaver.
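The two families of moves can be sketched in code. This is a minimal illustration and not part of the original paper: the function names `pi_upper_left`, `pi_lower_right` and `random_move`, and the list-of-lists representation of the array, are assumptions made here. Indices are 1-based as in the text.

```python
import random


def pi_upper_left(grid, i, j):
    """Apply pi_{ij}: rotate the i x j upper-left rectangle by 180 degrees."""
    out = [row[:] for row in grid]
    for r in range(1, i + 1):
        for s in range(1, j + 1):
            # card at (r, s) moves to (i + 1 - r, j + 1 - s)
            out[i - r][j - s] = grid[r - 1][s - 1]
    return out


def pi_lower_right(grid, i, j):
    """Apply pi'_{ij}: rotate the rectangle with corners (i, j) and (n, n)."""
    n = len(grid)
    out = [row[:] for row in grid]
    for r in range(i, n + 1):
        for s in range(j, n + 1):
            # card at (r, s) moves to (n + i - r, n + j - s)
            out[n + i - r - 1][n + j - s - 1] = grid[r - 1][s - 1]
    return out


def random_move(grid, rng):
    """One discrete step: uniform (i, j), then a fair coin picks the corner."""
    n = len(grid)
    i = rng.randint(1, n)
    j = rng.randint(1, n)
    if rng.random() < 0.5:
        return pi_upper_left(grid, i, j)
    return pi_lower_right(grid, i, j)


if __name__ == "__main__":
    start = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(pi_upper_left(start, 2, 2))
    print(pi_lower_right(start, 2, 2))
    print(random_move(start, random.Random(0)))
```

Each move is an involution, so applying the same $\pi_{ij}$ twice restores the array.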

Suppose first that the cards are shuffled by waiting a mean-one exponential amount of time, then picking $i$ and $j$ uniformly at random and performing the shuffle $\pi_{ij}$. (Setting the problem in continuous time avoids the later use of more complicated versions of theorems in [1] and [2] that take parity into account.) After time $t$, the resulting distribution $\mathcal{S}_0^t$ on permutations of the positions is given by

$$\mathcal{S}_0^t = \exp(t(\mathcal{S}_0 - 1)) \stackrel{\mathrm{def}}{=} \sum_{k \ge 0} e^{-t} \frac{t^k}{k!} \mathcal{S}_0^{(k)}$$

where $\mathcal{S}_0^{(k)}$ is the $k$-fold convolution of the measure $\mathcal{S}_0 = n^{-2} \sum_{i,j=1}^{n} \delta_{\pi_{ij}}$. Here and throughout, random walks on the space of card configurations are identified with random walks on the symmetric group; in particular, when discussing two coupled shuffles, it will be convenient to be able to refer to the positions $\sigma(x)$ and $\tau(x)$ of the same card $x$ in two arrays starting from two arbitrary configurations, one permuted by $\sigma$ and the other by $\tau$.

The card in the $(n,n)$ position is unlikely to move before time $cn^2$, which gives an easy lower bound on the time needed to randomize the layout. More precisely, if $A$ is the set of permutations fixing $(n,n)$, then $\mathcal{S}_0^t(A) \ge e^{-t/n^2}$ since $e^{-t/n^2}$ is the probability that the card in position $(n,n)$ is never moved at all. Thus

$$|\mathcal{S}_0^t - U| \ge e^{-t/n^2} - \frac{1}{n^2},$$

where $U$ is the uniform measure and $|\cdot|$ is the total variation distance. When $t \ll n^2$, therefore, the total variation distance is near one and the deck is not well shuffled. The same lower bound may be obtained by counting: the total number of permutations of cards is

$$n^2! = \exp((2 + o(1)) n^2 \log n),$$

whereas the set $A_k$ of permutations reachable in $k$ shuffles has cardinality at most $n^{2k}$. Thus, letting $k = \lfloor (1+\epsilon)t \rfloor$,

$$|\mathcal{S}_0^t - U| \ge |\mathcal{S}_0^t(A_k) - U(A_k)| \ge 1 + o(1) - \exp[2 \log n \,(k - (1 + o(1)) n^2)],$$

which is near 1 when $t \ll n^2$. It will be seen (Theorem 2 below) that the time to randomization is at most a constant times $n^2 \ln(n)$.
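The counting bound can be checked numerically. The sketch below is an illustration added here, not the author's computation: the helper names `log_factorial_ratio` and `log_uniform_mass_bound` are hypothetical, and the bound $|A_k| \le n^{2k}$ is taken from the text as given.

```python
from math import lgamma, log


def log_factorial_ratio(n):
    """Return log((n^2)!) / (n^2 log n); the text asserts this is 2 + o(1)."""
    cards = n * n
    return lgamma(cards + 1) / (cards * log(n))


def log_uniform_mass_bound(n, k):
    """Logarithm of n^(2k) / (n^2)!, an upper bound on U(A_k)."""
    return 2 * k * log(n) - lgamma(n * n + 1)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(log_factorial_ratio(n), 4))
    # With n = 100 and k = n^2 / 2 shuffles the reachable set has tiny
    # uniform mass, so the walk is still far from uniform.
    print(log_uniform_mass_bound(100, 5000))
```

The ratio approaches 2 only slowly, which is the $o(1)$ term in the displayed formula.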

The shuffle becomes more interesting if permutations $\pi'_{ij}$ are also allowed. If each $\pi_{ij}$ and $\pi'_{ij}$ occurs at rate $1/(2n^2)$, the distribution resulting at time $t$ will be

$$\mathcal{S}^t \stackrel{\mathrm{def}}{=} \exp(t(\mathcal{S} - 1)) = \sum_{k \ge 0} e^{-t} \frac{t^k}{k!} \mathcal{S}^{(k)}$$

where $\mathcal{S}$ gives probability $1/(2n^2)$ to each $\pi_{ij}$ and to each $\pi'_{ij}$. (The dependence of $\mathcal{S}$ and $\mathcal{S}_0$ on $n$ is suppressed in the notation.) Now the cards that take the longest to move are in positions $(1,n)$ and $(n,1)$ and these will each be moved by time $cn$ with probability $1 - e^{-c}$. Thus the first argument above shows only that the deck is not at all shuffled by time $t \ll n$. The counting argument from before does better: setting $k = \lfloor (1+\epsilon)t \rfloor$ shows that $|\mathcal{S}^t - U|$ is near 1 when $t \ll n^2$. On the other hand, it will be shown that the positions of any set of cards of any fixed size, $k$, will be jointly randomized by time $cn$ as $n \to \infty$. (By altering the shuffle again so that it may choose rectangles in the lower-left and upper-right corners as well, this time can be reduced to a constant when $k = 1$, but not for $k \ge 2$, since a pair of neighboring cards will always be stuck together for expected time $cn$.) This is the only shuffle I know of with the property that the time to randomization differs from the time to randomize subsets of any fixed size by factors greater than poly-log$(n)$. In fact, $k$ may be allowed to increase with $n$, in such a way that the time to randomize any $k$ cards is still much less than the time to total randomization. To quantify this, say that an event $A$ is measurable with respect to cards $x_1, \ldots, x_k$ if $A$ is a set of permutations of the form $\{\pi : (\pi(x_1), \ldots, \pi(x_k)) \in B\}$ for some cards $x_1, \ldots, x_k$, where $B$ is a subset of $k$-tuples of distinct positions in the array $[n \times n]$. Define the $k$-set distance to uniformity of a distribution $\mathcal{R}$, denoted $\|\mathcal{R} - U\|_k$, to be $\sup_A \mathcal{R}(A) - U(A)$ as $A$ ranges over events measurable with respect to the positions of some set of $k$ cards; setting $k = n^2$ recovers the total variation distance.
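For $k = 1$ the definition reduces to the largest total variation distance between the law of a single card's position and the uniform law on the $n^2$ positions, and for a small array this can be computed exactly. The sketch below is an illustration under that reading, not a computation from the paper. The names `single_card_kernel` and `one_card_distance` are hypothetical, and the Poisson series is truncated at `terms` steps, which is an approximation.

```python
from math import exp


def single_card_kernel(n):
    """One-step transition matrix for the position of a single card under S."""
    size = n * n

    def index(r, s):
        return (r - 1) * n + (s - 1)

    kernel = [[0.0] * size for _ in range(size)]
    weight = 1.0 / (2 * size)  # each pi_ij and pi'_ij has probability 1/(2n^2)
    for r in range(1, n + 1):
        for s in range(1, n + 1):
            src = index(r, s)
            for i in range(1, n + 1):
                for j in range(1, n + 1):
                    up = index(i + 1 - r, j + 1 - s) if (r <= i and s <= j) else src
                    low = index(n + i - r, n + j - s) if (r >= i and s >= j) else src
                    kernel[src][up] += weight
                    kernel[src][low] += weight
    return kernel


def one_card_distance(n, t, terms=200):
    """Worst-case TV distance from uniform of one card's position at time t."""
    kernel = single_card_kernel(n)
    size = n * n
    worst = 0.0
    for start in range(size):
        law_k = [0.0] * size          # law after exactly k jumps
        law_k[start] = 1.0
        law_t = [0.0] * size          # Poisson mixture: law at time t
        poisson = exp(-t)
        for k in range(terms):
            for y in range(size):
                law_t[y] += poisson * law_k[y]
            law_k = [sum(law_k[x] * kernel[x][y] for x in range(size))
                     for y in range(size)]
            poisson *= t / (k + 1)
        tv = 0.5 * sum(abs(p - 1.0 / size) for p in law_t)
        worst = max(worst, tv)
    return worst


if __name__ == "__main__":
    for t in (0.0, 3.0, 10.0, 30.0):
        print(t, one_card_distance(3, t))
```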

Theorem 1 There exists a constant $c$ such that for any $n$ and any $k$ with $1 \le k \le n^2$, $\|\mathcal{S}^t - U\|_k < 1/2^j$ whenever $t > ck^3(\ln(k))^2 nj$.

Theorem 2 For any $\epsilon > 0$, $\lim_n |\mathcal{S}^t - U| = 1$ when $t = (1 - \epsilon)n^2/2$. On the other hand there is a constant $c$ for which $|\mathcal{S}^t - U| < 1/2^j$ whenever $t > cjn^2 \ln(n)$. The same is true with $\mathcal{S}$ replaced by $\mathcal{S}_0$.

The author wishes to thank Martin Hildebrand for helpful comments toward the revised draft of this manuscript. The proofs of both theorems are based on techniques developed by Diaconis and others [1,2]. In particular, the second part of Theorem 2 uses eigenvalue machinery (the first part is just a counting argument) and the proof of Theorem 1 is a coupling argument. No new theory is developed in this paper; rather, it is hoped that the example is interesting.

2 Proof of Theorem 1 

Theorem 1 is proved via a series of lemmas that establish it for small values of $k$. Do not count on an unsubscripted $c$ to denote the same quantity from line to line.

Lemma 3 There exists a constant $c$ such that for any $n$, $\|\mathcal{S}^t - U\|_1 < 1/2^j$ whenever $t > cjn$.

Lemma 4 There exists a constant $c$ such that for any $n$, $\|\mathcal{S}^t - U\|_2 < 1/2^j$ whenever $t > cjn$.

Lemma 5 There exists a constant $c$ such that for any $n$, $\|\mathcal{S}^t - U\|_3 < 1/2^j$ whenever $t > cjn$.

To get from each lemma to the next, and thence to the theorem, the following type of coupling argument is used. For each finite set of cards $\{x_1, \ldots, x_k\}$, a Markov chain $\{(\sigma_t, \tau_t) : t \ge 0\}$ is defined on pairs of permutations of cards. It is a coupling of two copies of the shuffle $\mathcal{S}$ in the sense that the marginal on either coordinate is Markov with transitions from $\sigma$ to $\sigma\pi_{ij}$ or $\sigma\pi'_{ij}$ at rates $1/(2n^2)$ each, and that from some point onward $\sigma(x_i)$ will equal $\tau(x_i)$ for all $i$. (At this time the coupling is said to have succeeded, the initial configurations of cards having been any two arbitrary configurations.) Furthermore, there are constants $c', \delta > 0$ independent of the cards $x_1, \ldots, x_k$ such that for any pair $(\sigma_0, \tau_0)$, the probability that the coupling will succeed by time $c'n$ is at least $\delta$. Repeating this coupling $j\lceil \log(1/2)/\log(1 - \delta) \rceil$ times and letting $c = c'\lceil \log(1/2)/\log(1 - \delta) \rceil$ gives a coupling for which the probability that $\sigma_t(x_i) = \tau_t(x_i)$ for all $t > cjn$ and $1 \le i \le k$ is at least $1 - 1/2^j$. Since $x_1, \ldots, x_k$ were arbitrary as were the two initial configurations, this implies the desired conclusion. It remains to exhibit the couplings, which will be done in the notation of this paragraph and without any thrift in choices of constants. To avoid drowning in a mire of greatest-integer brackets, ignore them, i.e., assume without loss of generality that $n$ is divisible by all of the integer constants that arise in the proofs. Also, names such as $A$ and $B$ will be assigned anew for each lemma.
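The repetition step can be spelled out as follows. This is a routine check added for illustration. It assumes, as the text implies, that each block of length $c'n$ succeeds with conditional probability at least $\delta$ whatever the configurations at the start of the block; the symbol $m$ is introduced here only for this purpose.

```latex
m = \left\lceil \frac{\log(1/2)}{\log(1-\delta)} \right\rceil
\;\Longrightarrow\; (1-\delta)^{m} \le \tfrac{1}{2},
\qquad
P(\text{no success in } jm \text{ consecutive blocks})
\le (1-\delta)^{jm} \le 2^{-j}.
```

The $jm$ blocks occupy total time $jmc'n = cjn$ with $c = c'm$, which is the form of the bound in Lemmas 3 to 5.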

Proof of Lemma 3: For each starting position $(i,j) \in [n \times n]$, consider the set of possible positions to which a card in that position may jump under a single permutation, $\pi_{rs}$ or $\pi'_{rs}$. This is just the set $\{(a,b) : (n+1-a-i)(n+1-b-j) \ge 0\}$; pictorially, rotate by 180° to get the point $(n+1-i, n+1-j)$, then divide the array into (unequal) quadrants meeting there and the possible jump set will consist of the upper-left and lower-right quadrants; the jump set is the shaded region in figure 1. Let $A \subseteq [n \times n]$ be the region $i,j \le n/3$ and let $B$ be the region $i,j > 2n/3$; see figure 2. Observe that for any card $x_1$, in any position, the rate at which $x_1$ jumps into the region $A \cup B$ is at least $1/(3n)$. Indeed, the area of intersection of $A \cup B$ with the shaded region in figure 1 is minimized when $(i,j) = (1,n)$ or $(i,j) = (n,1)$. It is therefore possible to construct a coupling where at rate $1/(3n)$, independent of the past, both coordinates, $\sigma$ and $\tau$, simultaneously jump to permutations for which the card $x_1$ is in $A \cup B$. Call the first time this happens $T$. From the pictorial description of the jump set, it follows that any two positions in $A \cup B$ have at least $n^2/3$ positions in common to which both may jump; this bound suffices for our argument.

To finish the argument, let $C$ denote the set of positions reachable in a single jump from both $\sigma_T(x_1)$ and $\tau_T(x_1)$. Then the probability that the process $\{\sigma_t(x_1) : T \le t \le T+1\}$ contains precisely one jump and that $\sigma_{T+1}(x_1) \in C$ is at least $|C|/(2n^2)$ times the probability of exactly one jump, and therefore at least $(1/6)e^{-1}$. The same is true for the process $\{\tau_t(x_1) : T \le t \le T+1\}$. Thus the laws of $\sigma_{T+1}(x_1)$ and $\tau_{T+1}(x_1)$ both dominate a measure uniform on $C$ with total mass $e^{-1}/6$, and the coupling may be extended to time $T+1$ in such a way that $P(\sigma_{T+1}(x_1) = \tau_{T+1}(x_1)) \ge e^{-1}/6$. The coupling then succeeds in time $3n+1$ with probability at least $P(T \le 3n)e^{-1}/6 \ge e^{-1}(1 - e^{-1})/6$, which proves the lemma. □
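The description of the jump set and the two counting claims in this proof can be checked by brute force on a small array. The sketch below is an illustration added here, with hypothetical function names. It takes $A$ to be $i,j \le n/3$ and $B$ to be $i,j > 2n/3$, and assumes $n$ is divisible by 3.

```python
from itertools import product


def jump_set(n, i, j):
    """Positions reachable from (i, j) by one pi_rs or pi'_rs, by enumeration."""
    reach = set()
    for r, s in product(range(1, n + 1), repeat=2):
        if i <= r and j <= s:              # (i, j) lies in the rectangle of pi_rs
            reach.add((r + 1 - i, s + 1 - j))
        if i >= r and j >= s:              # (i, j) lies in the rectangle of pi'_rs
            reach.add((n + r - i, n + s - j))
    return reach


def jump_set_formula(n, i, j):
    """The closed form stated in the proof of Lemma 3."""
    return {(a, b) for a, b in product(range(1, n + 1), repeat=2)
            if (n + 1 - a - i) * (n + 1 - b - j) >= 0}


def in_A_or_B(n, a, b):
    third = n // 3
    return (a <= third and b <= third) or (a > 2 * third and b > 2 * third)


def moves_into_A_or_B(n, i, j):
    """Number of the 2n^2 permutations sending the card at (i, j) into A u B."""
    count = 0
    for r, s in product(range(1, n + 1), repeat=2):
        if i <= r and j <= s and in_A_or_B(n, r + 1 - i, s + 1 - j):
            count += 1
        if i >= r and j >= s and in_A_or_B(n, n + r - i, n + s - j):
            count += 1
    return count


def min_common_targets(n):
    """Smallest number of common one-jump targets over pairs in A u B."""
    cells = [p for p in product(range(1, n + 1), repeat=2) if in_A_or_B(n, *p)]
    return min(len(jump_set(n, *p) & jump_set(n, *q)) for p in cells for q in cells)


if __name__ == "__main__":
    n = 6
    positions = list(product(range(1, n + 1), repeat=2))
    print(all(jump_set(n, *p) == jump_set_formula(n, *p) for p in positions))
    # rate into A u B is count / (2 n^2); the claim is that it is >= 1 / (3 n)
    print(min(moves_into_A_or_B(n, *p) for p in positions), 2 * n // 3)
    print(min_common_targets(n), n * n / 3)
```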

Proof of Lemma 4: A useful observation is that if cards $x_1$ and $x_2$ are both at least some minimal distance $d$ from any edge of the array, and some permutation $\pi_{ij}$ is applied which moves $x_1$ but not $x_2$, then further application of any $\pi_{kl}$ with $n - d/2 \le k,l \le n - d/4$ sends both cards to positions at least $d/4$ distant from any edge of the array. Some notation for the set of positions distant from any edge will also be useful. Let $A_j \subseteq [n \times n]$ be the set of positions

$$\{(i,k) : \tfrac{n}{10j} \le i,k \le n - \tfrac{n}{10j}\}.$$

Let $B \subseteq [n \times n]^2$ denote the set

$$\{((i_1,j_1),(i_2,j_2)) \in (A_4)^2 : \max(|i_1 - i_2|, |j_1 - j_2|) \ge n/40\}$$

of pairs of positions in $A_4$ separated by at least $n/40$ in at least one coordinate. Define $B_0$ to be the set of pairs of positions, one of which is in $A_2$ and the other of which has both coordinates less than $n/40$. Let $C \subseteq [n \times n]^2$ be the set

$$\{((i_1,j_1),(i_2,j_2)) \in (A_{32})^2 : \min(|i_1 - i_2|, |j_1 - j_2|) \ge n/160\}.$$

Finally, let $D$ be the set of pairs of coordinates $\{((i_1,j_1),(i_2,j_2)) : i_1,j_1 \le n/3,\ i_2,j_2 \ge 2n/3\}$.

Pick any distinct cards $x_1$ and $x_2$, and suppose the positions $(i_1,j_1)$ and $(i_2,j_2)$ of both cards are in $A_2$. Either $i_1 \ne i_2$ or $j_1 \ne j_2$; assume without loss of generality that $i_1 \ne i_2$, since the argument is symmetric in $i$ and $j$; furthermore, assume without loss of generality that $i_1 < i_2$, since the argument is symmetric in the two copies of the shuffle. If we choose $j$ so that $j_1 \le j \le j_1 + n/20$, then the permutation $\pi_{i_1 j}$ moves $x_1$ to a position $(1,b)$ with $b \le n/40$ and does not move $x_2$. The positions of the cards now differ by at least $n/40$ in the second coordinate. Since any permutation $\pi_{kl}$ with $k,l \ge 39n/40$ will move both cards, it will also preserve their separation; applying the observation at the beginning of this proof (with $d = n/20$) shows that there are at least $n^2/6400$ permutations $\pi_{ij}$ whose further application will result in the cards $x_1$ and $x_2$ having a pair of positions in $B$. It has thus been shown that

Whenever $x_1, x_2 \in A_2$, the rate of jumping to a pair of positions in $B_0$ is at least $1/(160n)$; when the pair of positions is in $B_0$ then the rate of jumping to a pair in $B$ is at least $1/12800$.

Similar reasoning shows that whenever the pair of positions of $x_1$ and $x_2$ is in $B$, the probability that the pair of positions will be in $C$ two jumps later is at least a constant, $c$: there are at least $n^2/25600$ permutations $\pi_{ab}$ moving one card into the region

$$\{(r,s) : 1 \le r,s \le n/160\}$$

while keeping the other card fixed; these also separate the cards by at least $n/80$ in both coordinates; from here, any $\pi_{rs}$ with $319n/320 \ge r,s \ge 159n/160$ will land the pair of positions of $x_1$ and $x_2$ in $C$.

A final observation along these lines is that whenever the pair of positions of $x_1$ and $x_2$ is in $C$, the probability of finding the pair in $D$ three jumps later is at least another constant. The three moves which may be necessary are: if $x_2$ is above and to the left of $x_1$, then apply any $\pi_{ij}$ with $i,j \ge 159n/160$ (otherwise, skip this step); now if $(i_1,j_1)$ is the new position of $x_1$, then $i_1,j_1 \le 319n/320$ and any $\pi_{kl}$ with $(i_1,j_1) \le (k,l) \le (i_1,j_1) + (n/320,n/320)$ will move $x_1$ into the upper-left corner without disturbing $x_2$; the separation between the cards is still at least $n/320$ in at least one coordinate, and the coordinates $(i_2,j_2)$ of the second card are at least $n/320$, so there are at least $n^2/102400$ $\pi'_{ij}$ moves that will get $x_2$ into the lower-right corner without disturbing $x_1$.

A useful and self-evident principle when coupling two identical copies of a countable recurrent Markov chain is that if the rate to jump from each state in the set $\Theta$ into the set $\Xi$ is at least $\delta$, then a coupling $\{X_t, Y_t\}$ and a time $T$ exist such that $X_{T-}, Y_{T-} \in \Theta$, $X_T, Y_T \in \Xi$, and such that the Lebesgue measure of $\{t < T : X_t, Y_t \in \Theta\}$ has exponential distribution with mean $1/\delta$. [One way to establish this is to define two independent copies $\{X'_t, Y'_t\}$, altered in any way that reduces the jump rate into $\Xi$ by $\delta$ at each state in $\Theta$, to let $Z$ be an independent Poisson process of rate $\delta$, to let $T$ be the first time $t$ at which $Z_{t-} \ne Z_t$ while $X_t, Y_t \in \Theta$, and to let $X_t = X'_t$ and $Y_t = Y'_t$ for $t < T$, while $X_T$ and $Y_T$ jump into $\Xi$ with whatever distribution was subtracted before, and then the two evolve independently.]

Thus the lower bound on the rate of jumping from a pair in $A_2$ to the set $B_0$ gives rise via this principle to a coupling $\{\sigma_t, \tau_t\}$ and a time $T$ at which $\sigma$ and $\tau$ simultaneously jump into $B_0$. Use this coupling just up to the time $T$, and then for $T \le t \le T+6$, let $\sigma$ and $\tau$ evolve independently. Now essentially copy the argument at the end of the proof of Lemma 3. The probability of precisely 6 jumps occurring in $\sigma_t$ in the interval $[T, T+6]$ is $e^{-6}6^6/6! > 1/7$; conditional on this, the probability that the pair of positions of $x_1$ and $x_2$ under $\sigma_{T+6}$ is in $D$ is at least the product of the three constants above (one constant to get to $B$ in one jump, one to get to $C$ in two more jumps and one to get to $D$ three jumps after that). Since $\tau_t$ behaves identically, the probability of the event $G$ is at least a constant, where $G$ is the event that the pairs of positions of $x_1$ and $x_2$ under both $\sigma_{T+6}$ and $\tau_{T+6}$ are in $D$.

Finally, observe that conditional on $G$, $\sigma$ and $\tau$ may be coupled by time $T+8$ with probability bounded away from zero: let $\sigma$ and $\tau$ both jump exactly twice, using some $\pi_{i_1 j_1}$ and $\pi_{i_2 j_2}$ (as in the proof of the preceding lemma) to send $x_1$ to the same position in $[n/6 \times n/6]$ and using some $\pi'_{i_1 j_1}$ and $\pi'_{i_2 j_2}$ to send $x_2$ to the same position in the lower-right square of this size. All that remains is to bound the stopping time, $T$.

By the previous lemma there is a $k$ such that $t \ge kn$ implies $\|\mathcal{S}^t - U\|_1 < .01$. This implies that for $t \ge kn$ and any card $x$, $P(\mathcal{S}^t(x) \in A_2) \ge U(A_2) - .01 = .8$. Thus the two independent copies of the Markov chain $\{\sigma'_t\}$ and $\{\tau'_t\}$ used to construct the coupling must satisfy

$$P(\sigma'_t(x_1), \sigma'_t(x_2), \tau'_t(x_1), \tau'_t(x_2) \in A_2) \ge 1 - 4(1 - .8) = .2$$

for any $t \ge kn$. In particular this implies that if $M \subseteq [kn, 2kn]$ is the set of times $t$ for which the positions $\sigma'_t(x_1)$, $\sigma'_t(x_2)$, $\tau'_t(x_1)$ and $\tau'_t(x_2)$ are all in $A_2$, then

$$.2kn \le E\lambda(M) \le .1kn + kn\,P(\lambda(M) \ge .1kn),$$

where $\lambda$ is Lebesgue measure, and solving this gives $P(\lambda(M) \ge .1kn) \ge .1$. The coupling is constructed so that

$$P(T \le 2kn \mid \lambda(M)) \ge 1 - \exp(-\lambda(M)/(160n)).$$

Thus $P(T \le 2kn) \ge (.1)(1 - \exp(-k/1600))$.

This, together with the success of the coupling by time $T+8$ with constant probability, proves that the coupling succeeds by time $2kn+8$ with some constant probability, which suffices to prove the lemma, since the coupling may be restarted at times that are multiples of $2kn+8$ until it succeeds. □
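The step from the expectation bound to the probability bound is a reverse Markov inequality; it is written out here for illustration, with $p$ a symbol introduced only for this purpose. Since $M \subseteq [kn, 2kn]$, one has $\lambda(M) \le kn$ always, and writing $p = P(\lambda(M) \ge .1kn)$,

```latex
.2kn \;\le\; E\lambda(M)
\;\le\; (.1kn)(1-p) + (kn)\,p
\;\le\; .1kn + kn\,p
\quad\Longrightarrow\quad p \ge .1,
\qquad
P(T \le 2kn) \;\ge\; p\left(1 - e^{-(.1kn)/(160n)}\right)
\;\ge\; (.1)\left(1 - e^{-k/1600}\right).
```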

Proof of Lemma 5: This proof uses similar moves to the last proof, so only the new part will be described. Let $x_1$, $x_2$ and $x_3$ be any three cards. By the previous lemma, choose a $k$ for which $\|\mathcal{S}^t - U\|_2 < 1/4$ when $t \ge kn$. Construct the coupling by first letting $\sigma$ and $\tau$ evolve independently for time $kn$. Let $(a_j^1, a_j^2)$ denote the position of $\sigma_t(x_j)$ and $(b_j^1, b_j^2)$ denote the position of $\tau_t(x_j)$; for convenience, define $a_0^1 = a_0^2 = b_0^1 = b_0^2 = 1$ and $a_4^1 = a_4^2 = b_4^1 = b_4^2 = n$. Let

$$M_t = \min\{|a_i^k - a_j^k|, |b_i^k - b_j^k|, |a_i^k - b_j^k| : k = 1,2;\ i \ne j;\ 0 \le i,j \le 4\}.$$

Thus under both $\sigma$ and $\tau$, all cards $x_1$, $x_2$ and $x_3$ are separated in each coordinate by $M_t$ from each other and from the boundary of the array, and for $i \ne j$, $\sigma_t(x_i)$ and $\tau_t(x_j)$ are separated as well.

Under the product uniform distribution, $(U \times U)$, observe

$$(U \times U)(M_t < n/240) \le .3;$$

this is because the event $\{M_t < n/240\}$ is the union of 36 events of probability at most 1/120: 12 events that some coordinate of some card under one of $\sigma_t$ or $\tau_t$ is within $n/240$ of 1 or $n$, 12 events that some coordinate of $\sigma_t(x_i)$ is too close to the same coordinate of $\tau_t(x_j)$, 6 events that some $\sigma_t(x_i)$ and $\sigma_t(x_j)$ are within $n/240$ in some coordinate, and 6 events that some $\tau_t(x_i)$ and $\tau_t(x_j)$ are within $n/240$ in some coordinate.

Therefore $P(M_{kn} < n/240) < 3/4$, by choice of $k$, since $\{M_{kn} < n/240\}$ is built from events each depending only on the positions of two cards. Conditional on $M_{kn} \ge n/240$, $\sigma_{kn+5}$ and $\tau_{kn+5}$ may be coupled so that the positions of all three cards $x_1$, $x_2$ and $x_3$ are the same under $\sigma$ and $\tau$ with probability bounded away from zero. The five moves that may be necessary are: (1) couple $\sigma(x_1)$ and $\tau(x_1)$ by moving them both to the upper-left $n/720 \times n/720$ square; (2) move this coupled card into the bottom-right $n/1440 \times n/1440$ square by time $kn+2$; (3) couple $\sigma(x_2)$ and $\tau(x_2)$ in an even smaller upper-left region; (4) move $x_2$ to the region in the lower-right (but not all the way in the corner) defined by $\{(i,j) : n - n/360 \le i,j \le n - n/720\}$; note that this does not disturb $x_1$; (5) couple $x_3$. □

Proof of Theorem 1 from Lemma 5: The method used to prove Lemma 5 may be generalized to any $k$ but the coupling time is then exponential in $k$. To get a power law in $k$, it is necessary to construct a less wasteful coupling. When $k \ge \sqrt{n}$, $k^3 n > n^2 \ln n$, and Theorem 1 is subsumed in Theorem 2. So no generality is lost in assuming that $k < \sqrt{n}$. Fix any $k$ cards, $x_1, \ldots, x_k$. A sequence of stopping times will be defined at which the probabilities of certain "good" events occurring in the near future are large. The stopping times are called $\{T(u,v) : 1 \le u \le k, 1 \le v \le l(u)\}$ and $\{T_j : 0 \le j \le k\}$ and when $j \ge 1$, they satisfy

$$T_{j-1} \le T(j,1) < T(j,1)+1 \le T(j,2) < \cdots \le T(j,l(j)) < T(j,l(j))+1 = T_j.$$

Informally, at each $T(u,v)$, either something good happens one time unit later, in which case $T_u = T(u,v)+1$ and $l(u) = v$, or else we wait for the next auspicious time, $T(u,v+1)$.

Describing the behavior of the coupling between times $T(u,v)$ and $T(u,v)+1$ takes a little notation, but at all other times the construction is simple. Let $(\sigma_t, \tau_t)$ evolve independently until time $T_0$. For $t \in [T_j, T(j+1,1)]$ and for $t \in [T(u,v)+1, T(u,v+1)]$, $v < l(u)$, let $\sigma$ and $\tau$ evolve in parallel, so that $\sigma$ jumps to $\sigma\pi$ if and only if $\tau$ jumps to $\tau\pi$. No technical problems arise in switching between these behaviors as long as the $T(u,v)$ are honest stopping times and the event $\{l(u) = v\}$ is in the $\sigma$-field of events up to time $T(u,v)+1$.

To handle the remaining times, define $W(t)$ to be the set $\{s \le k : \sigma_t(x_s) = \tau_t(x_s)\}$. Informally, this is the set of cards whose positions are the same under $\sigma$ and $\tau$ at time $t$. Since the coupling depends on knowing something about the configurations at times $T(u,v)$, we begin by defining those. First, define

$$T_0 = \inf\{t \ge 0 : \sigma_t(x_s) \ne \tau_t(x_{s'}) \text{ for all } s, s' \le k\}.$$

Clearly this is a stopping time, and $W(T_0) = \emptyset$. It will be verified inductively that

$$W(s) \subseteq W(t) \text{ for } T_0 \le s \le t, \quad |W(T_j)| = j, \quad \text{and} \quad |W(T(u,v))| = u-1. \qquad (1)$$

It will also be verified that $\sigma_t(x_s) \ne \tau_t(x_{s'})$ for all $t \ge T_0$ and $s \ne s'$. Since $\sigma$ and $\tau$ move in parallel except on $t \in [T(u,v), T(u,v)+1]$ and since these two statements are true at time $T_0$, we need only verify that they remain true over the time intervals $[T(u,v), T(u,v)+1]$. For any $u \le k$ and $1 < v \le l(u)$, define

$$T(u,v) = \inf\Big\{t \ge T(u,v-1)+1 : \exists\, s = s(u,v) \notin W(T_{u-1}) \text{ s.t. } \sigma_t(x_s), \tau_t(x_s) \in \big[1, \tfrac{n}{3\sqrt{k}}\big] \times \big[1, \tfrac{n}{3\sqrt{k}}\big]$$
$$\text{and } \sigma_t(x_{s'}), \tau_t(x_{s'}) \notin \big[1, \tfrac{n}{2\sqrt{k}}\big] \times \big[1, \tfrac{n}{2\sqrt{k}}\big] \text{ for } s' \ne s\Big\}.$$

Define $T(u,1)$ identically, but with $T_{u-1}$ in place of $T(u,v-1)+1$. Informally, $T(u,v)$ is the first time after $T(u,v-1)+1$ (or $T_{u-1}$ if $v = 1$) that some card $x_s$ not yet in $W$ is sent to a square region in the top-left corner by both $\sigma$ and $\tau$, while all other cards are sent to a region in the lower-right that is the complement of a slightly larger square region. Clearly, these are stopping times and $W$ cannot change on $[T(u,v-1)+1, T(u,v)]$ because $\sigma$ and $\tau$ are evolving in parallel.

For each $a,b \le n/(6\sqrt{k})$, there are unique $i(a,b), j(a,b) \le n/(2\sqrt{k})$ for which $\pi_{ij}[\sigma_{T(u,v)}(x_s)] = (a,b)$, while $\pi_{ij}[\sigma_{T(u,v)}(x_{s'})] = \sigma_{T(u,v)}(x_{s'})$ for $s' \ne s$. The same is true with $\sigma_{T(u,v)}$ replaced by $\tau_{T(u,v)}$; call these $i^*(a,b)$ and $j^*(a,b)$. It is therefore possible to choose a pair $(\pi, \pi^*)$ in such a way that each of $\pi$ and $\pi^*$ is uniform over $\{\pi_{xy} : 1 \le x,y \le n\}$, that

$$P(\pi = \pi_{i(a,b)j(a,b)},\ \pi^* = \pi_{i^*(a,b)j^*(a,b)}) \ge \frac{1}{n^2}, \qquad (2)$$

and that with probability one, either $\pi = \pi^*$ or else

$$\pi = \pi_{xy},\ \pi^* = \pi_{x^*y^*} \text{ for some } x,y,x^*,y^* \le \frac{n}{2\sqrt{k}}. \qquad (3)$$

For a single shuffle, $\mathcal{S}^t$, the probability of precisely one jump occurring in a unit of time and that jump being a $\pi_{ij}$ rather than a $\pi'_{ij}$ is $1/(2e)$. By this observation and (2) and (3), we may construct the coupling for $t \in [T(u,v), T(u,v)+1]$ so that with probability $1 - 1/(2e)$ the two processes $\sigma$ and $\tau$ evolve in parallel, jumping either zero times, more than once, or jumping exactly once by some $\pi'_{ij}$, while with probability $1/(2e)$ the two processes jump exactly once by some $\pi$ and $\pi^*$ picked from the joint distribution described above.

Define $l(u) = v$ if this last possibility occurs (jumps of $\pi$ and $\pi^*$) and if furthermore, $\pi = \pi_{i(a,b),j(a,b)}$ and $\pi^* = \pi_{i^*(a,b),j^*(a,b)}$ for some $a,b \le n/(6\sqrt{k})$. This is of course measurable with respect to events until time $T(u,v)+1$, and when it occurs, $W(T(u,v)+1) = W(T(u,v)) \cup \{s\}$, with $x_s = x_{s(u,v)}$ being the witnessing card for the stopping time $T(u,v)$. In this case, $T_u$ is defined to equal $T(u,v)+1$ and the inductive statement (1) is verified. On the other hand, if $l(u) > v$, then $W(T(u,v)+1) = W(T(u,v))$, since either the shuffles evolved in parallel or else (3) guarantees that no card $x_{s'}$ other than $x_s$ was moved by either shuffle. Thus again, (1) is verified. In either case (parallel shuffles or no card $x_{s'}$ other than $x_s$ moved by either shuffle), it is clear that the statement $\sigma_t(x_r) \ne \tau_t(x_{r'})$ is preserved for all $r \ne r'$.

A consequence of (1) is that all $k$ cards are coupled by time $T_k$. Thus to prove the theorem it suffices to find a constant $c$ for which

$$P[T_k > cnk^3(\ln(k))^2] < 1/2. \qquad (4)$$

Let $\mathcal{F}(t)$ denote the $\sigma$-field of events up to time $t$. We begin by showing that $ET_0 \le cn\ln(k)$. Using Lemma 3 for $t = c_0 n \ln(k)$, with $c_0 \ge 3c/\ln 2$, gives

$$\|\mathcal{S}^t - U\|_1 \le \frac{1}{k^3}.$$

Then for this $t$, $P(\sigma_t(x_s) = \tau_t(x_{s'})) \le 1/k^3 + 1/n^2$ for each fixed $s,s' \le k$ and summing gives a probability of at most $1/k + k^2/n^2$ that some $\sigma_t(x_s) = \tau_t(x_{s'})$. Since $4 \le k \le \sqrt{n}$ in any nontrivial case, this probability is bounded above by 1/2. Repeating this argument at times that are multiples of $t$ shows $T_0$ to be stochastically dominated by $t$ times a geometric of mean two, proving that $ET_0 \le cn\ln(k)$ for an appropriate $c$.
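The arithmetic behind this estimate can be written out as follows. This is a worked restatement added for illustration; it uses only Lemma 3 as stated, the independence of $\sigma$ and $\tau$ before time $T_0$, and the range $4 \le k \le \sqrt{n}$.

```latex
t = c_0 n \ln k,\ c_0 \ge \tfrac{3c}{\ln 2}
\;\Longrightarrow\; t \ge c\,j\,n \text{ with } j = 3\log_2 k,
\quad\text{so}\quad \|\mathcal{S}^t - U\|_1 \le 2^{-j} = k^{-3};
\\[4pt]
P(\sigma_t(x_s) = \tau_t(x_{s'}))
= \sum_{p} P(\sigma_t(x_s) = p)\,P(\tau_t(x_{s'}) = p)
\le \max_p P(\sigma_t(x_s) = p)
\le \frac{1}{n^2} + \frac{1}{k^3};
\\[4pt]
\sum_{s,s' \le k}\Big(\frac{1}{k^3} + \frac{1}{n^2}\Big)
= \frac{1}{k} + \frac{k^2}{n^2}
\le \frac{1}{4} + \frac{1}{n}
\le \frac{1}{2}.
```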

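Next, we establish that

$$(i)\quad E\big(T(u,v+1) - (T(u,v)+1) \mid \mathcal{F}(T(u,v)+1)\big) \le cn\,\frac{k^2\ln(k)}{k+1-u};$$

$$(ii)\quad E\big(T(u,1) - T_{u-1} \mid \mathcal{F}(T_{u-1})\big) \le cn\,\frac{k^2\ln(k)}{k+1-u}.$$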

By Lemma 5, choose $r = cn\ln(k)$ so that $\|\mathcal{S}^r - U\|_3 \le 1/(400k^5)$. Write $B$ for the region $[1, n/(3\sqrt{k})] \times [1, n/(3\sqrt{k})]$ and write $C$ for the region $[1, n/(2\sqrt{k})] \times [1, n/(2\sqrt{k})]$. Pick any $s \notin W(T_{u-1})$ and let $y = \sigma_{T(u,v)+1}(x_s)$ and $z = \tau_{T(u,v)+1}(x_s)$. The set $Q$ of permutations $\pi$ for which $\pi(y) \in B$ and $\pi(z) \in B$ has probability

$$U(Q) = \frac{|B|\,(|B| - 1)}{n^2(n^2 - 1)} \ge \frac{1}{100k^2}$$

under the uniform distribution. The permutations $\sigma_{T(u,v)+1+r}(\sigma_{T(u,v)+1})^{-1}$ and $\tau_{T(u,v)+1+r}(\tau_{T(u,v)+1})^{-1}$ are equal and their conditional distribution given $\mathcal{F}(T(u,v)+1)$ is the distribution of $\mathcal{S}^r$. Since $r$ is chosen to make $\|\mathcal{S}^r - U\|_2 \le \|\mathcal{S}^r - U\|_3 \le 1/(400k^5)$, it follows that

$$P\big(\sigma_{T(u,v)+1+r}(x_s) \in B \text{ and } \tau_{T(u,v)+1+r}(x_s) \in B \mid \mathcal{F}(T(u,v)+1)\big) \ge \frac{1}{100k^2} - \frac{1}{400k^5}. \qquad (5)$$

For $w \ne y,z$,

$$U\{\pi : \pi(y) \in B,\ \pi(z) \in B \text{ and } \pi(w) \in C\} \le \frac{1}{324k^3}.$$

Setting $w = \sigma_{T(u,v)+1}(x_{s'})$ for some $s' \ne s$ and using $\|\mathcal{S}^r - U\|_3 \le 1/(400k^5)$ again yields

$$P\big(\sigma_{T(u,v)+1+r}(x_s) \in B \text{ and } \tau_{T(u,v)+1+r}(x_s) \in B \text{ and } \sigma_{T(u,v)+1+r}(x_{s'}) \in C \mid \mathcal{F}(T(u,v)+1)\big) \le \frac{1}{324k^3} + \frac{1}{400k^5}. \qquad (6)$$

If we instead let $w = \tau_{T(u,v)+1}(x_{s'})$, we see that the same is true with $\sigma_{T(u,v)+1+r}(x_{s'}) \in C$ replaced by $\tau_{T(u,v)+1+r}(x_{s'}) \in C$. Let $G(u,v,s)$ be the event that $\sigma_{T(u,v)+1+r}(x_s) \in B$, that $\tau_{T(u,v)+1+r}(x_s) \in B$, and that for all $s' \ne s$, $\sigma_{T(u,v)+1+r}(x_{s'}), \tau_{T(u,v)+1+r}(x_{s'}) \notin C$. Then summing (6) over $s' \ne s$, doubling, and subtracting from (5), gives

$$P\big(G(u,v,s) \mid \mathcal{F}(T(u,v)+1)\big) \ge \frac{1}{100k^2} - \frac{1}{400k^5} - 2k\Big(\frac{1}{324k^3} + \frac{1}{400k^5}\Big) \ge \frac{1}{400k^2},$$

since $k \ge 4$. The events $G(u,v,s)$ are disjoint as $s$ varies. Recalling that $T(u,v+1)$ has been reached when $G(u,v,s)$ occurs for some $s \notin W(T_{u-1})$ and summing over such $s$ gives

$$P\big(T(u,v+1) \le T(u,v)+1+r \mid \mathcal{F}(T(u,v)+1)\big) \ge \frac{k+1-u}{400k^2}.$$

Comparing to another geometric random variable, recalling the value of $r$ and rolling all constants into one gives

$$E\big(T(u,v+1) - T(u,v) - 1 \mid \mathcal{F}(T(u,v)+1)\big) \le cn\,\frac{k^2\ln(k)}{k+1-u}.$$

This establishes (i) above, the argument for (ii) being identical.
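The numerical inequality used to bound the probability of $G(u,v,s)$ can be checked with exact rational arithmetic. The sketch below is an illustration added here: the function names are hypothetical, and the constants 100, 324 and 400 are taken from bounds (5) and (6) as stated.

```python
from fractions import Fraction


def lower_bound_on_G(k):
    """Bound (5) minus 2k times bound (6), computed exactly."""
    k = Fraction(k)
    bound5 = 1 / (100 * k**2) - 1 / (400 * k**5)
    bound6 = 1 / (324 * k**3) + 1 / (400 * k**5)
    return bound5 - 2 * k * bound6


def claimed_bound(k):
    """The right-hand side 1 / (400 k^2) asserted in the text."""
    return Fraction(1, 400 * k * k)


if __name__ == "__main__":
    for k in (2, 3, 4, 10, 100):
        print(k, lower_bound_on_G(k) >= claimed_bound(k))
```

For these constants the inequality already fails at $k = 2$, so some lower restriction on $k$ of the kind stated in the text is needed.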

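By construction, $P(T_u = T(u,v)+1 \mid \mathcal{F}(T(u,v))) \ge 1/(36k)$ on the event $l(u) \ge v$. This implies $El(u) \le 36k$. Thus, setting $T(u,0) = T_{u-1}$,

$$E(T_u - T_{u-1}) = E\sum_{v=1}^{l(u)} \big(T(u,v) - T(u,v-1)\big)$$

$$= E\sum_{v=1}^{\infty} \mathbf{1}_{l(u) \ge v}\big(T(u,v) - T(u,v-1)\big)$$

$$= E\Big[\sum_{v=1}^{\infty} \mathbf{1}_{l(u) \ge v}\, E\big(T(u,v) - T(u,v-1) - 1 \mid \mathcal{F}(T(u,v-1)+1)\big)\Big] + El(u)$$

$$\le E\Big[\sum_{v=1}^{\infty} \mathbf{1}_{l(u) \ge v}\, cn\,\frac{k^2\ln(k)}{k+1-u}\Big] + El(u)$$

$$\le El(u)\, cn\,\frac{k^2\ln(k)}{k+1-u}.$$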

Summing over $u$ gives $E(T_k - T_0) \le cnk^3(\ln(k))^2$, and using the earlier estimate on $ET_0$ shows that $ET_k \le cnk^3(\ln(k))^2$. Since $T_k$ is positive, this implies (4), which proves Theorem 1. □

3 Proof of Theorem 2

The proof of the nontrivial part of Theorem 2, namely the upper bound, is gotten by analyzing the eigenvalues of the random walk on $S_{n^2}$ whose steps have distribution $\mathcal{S}$. To abbreviate the terminology, say the eigenvalues of a probability distribution $P$ are the eigenvalues of its random walk, and if $P$ is uniform on some set $A$, call these also the eigenvalues of $A$.

The eigenvalue analysis is done in three steps. Define another shuffle $\mathcal{R}$ which chooses a three-cycle uniformly from among all $2\binom{n^2}{3}$ three-cycles at total rate one. (A three-cycle permutes three cards cyclically and leaves the remaining $n^2 - 3$ cards untouched.) The first step, Lemma 7 below, compares the eigenvalues of $\mathcal{S}$ with the eigenvalues of $\mathcal{R}$. This relies on a lemma from [2], Lemma 6 below, which bounds the eigenvalues of one shuffle in terms of the eigenvalues of a second, more tractable, shuffle when the permutations in the second shuffle are explicitly written as products of permutations in the first shuffle. The second step is to compute the eigenvalues of $\mathcal{R}$. This is done via the representation theory of the symmetric group, and can be read off from known results in [3]. Finally, the information about the eigenvalues of $\mathcal{S}$ is used to get an upper bound on the difference between $\mathcal{S}^t$ and $U$ in total variation, and hence on the time to randomization. This argument closely parallels the proof of Theorem 5 in [1, ch. 3], which does an analogous computation but for transpositions instead of three-cycles.

Lemma 6 (Diaconis 1992) Let $A_1, A_2 \subseteq S_n$ be sets of permutations that generate $S_n$ and are symmetric, i.e. $\pi \in A_i$ if and only if $\pi^{-1} \in A_i$. For each $\pi \in A_2$, pick a way of writing $\pi$ as a product of elements of $A_1$; let $N(\sigma,\pi)$ denote the number of times $\sigma$ appears in this product and let $|\pi|$ denote the number of factors in the product. This defines a constant

$$B = \frac{|A_1|}{|A_2|} \max_{\sigma \in A_1} \sum_{\pi \in A_2} |\pi|\, N(\sigma,\pi).$$

Let $\mathcal{S}_i$ be the uniform distribution on $A_i$. Choose any subspace $V \subseteq \mathbb{C}^{S_n}$ which is invariant for the right regular representation of $S_n$ and let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$ be the eigenvalues of $\mathcal{S}_2$ on the subspace $V$ in descending order, counted with proper multiplicity. Writing the eigenvalues of $\mathcal{S}_1$ on the subspace $V$ as $\lambda'_1 \ge \lambda'_2 \ge \cdots \ge \lambda'_k$, the relation

$$1 - \lambda_i \le B(1 - \lambda'_i) \qquad (7)$$

holds for $i = 1, \ldots, k$. □

Proof: Let $\mathcal{E}$ be the Dirichlet form for $\mathcal{S}_2$, namely the symmetric, positive definite form on $\mathbb{C}^{S_n}$ defined by $\mathcal{E}(f,f) = \langle (I - \mathcal{S}_2)(f), f \rangle$, where $\mathcal{S}_2(f)(z) = |A_2|^{-1} \sum_{x \in A_2} f(zx)$ and $\langle \cdot, \cdot \rangle$ is the usual inner product. Let $\mathcal{E}'$ be the Dirichlet form for $\mathcal{S}_1$. Then Theorem 1 of [2] shows that

$$\mathcal{E} \le B\mathcal{E}'.$$

Lemma 4 of [2] then implies (7) when $V$ is all of $\mathbb{C}^{S_n}$. If $V$ is not the whole space, then observe that $V$ has an orthogonal complement $V^{\perp}$ which is also an invariant subspace. Thus the Dirichlet forms $\mathcal{E}$ and $\mathcal{E}'$ decompose into the direct sums of forms on $V$ and $V^{\perp}$. The relation $\mathcal{E} \le B\mathcal{E}'$ must then hold on $V$, and the proof is again finished by Lemma 4 of [2]. □

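Lemma 7 Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n^2!-2}$ be all the eigenvalues of the shuffle $\mathcal{R}$ except for the two eigenvalues of $+1$ which occur on the one-dimensional invariant subspaces $V_+ = \{f : f(x) = f(y) \text{ for all } x,y\}$ and $V_- = \{f : f(x)\,\mathrm{sign}(x) = f(y)\,\mathrm{sign}(y) \text{ for all } x,y\}$. Let $\lambda'_1 \ge \lambda'_2 \ge \cdots \ge \lambda'_{n^2!-2}$ be the eigenvalues of $\mathcal{S}_0$ on the space $V_{\pm} = \{f : \sum_x f(x) = \sum_x f(x)\,\mathrm{sign}(x) = 0\}$ which is the orthogonal complement of $(V_+ \oplus V_-)$. There is a constant $c$ such that for all $i \le n^2! - 2$,

$$(1 - \lambda_i) \le c(1 - \lambda'_i).$$

The same holds when $\mathcal{S}_0$ is replaced by $\mathcal{S}$.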

Proof: We first handle the case of $\mathcal{S}_0$. To apply Lemma 6, let $A_1$ be all the $\pi_{ij}$ and let $A_2$ be all the three-cycles. Picking ways to write elements of $A_2$ as products of elements of $A_1$ requires several steps. Let $A_3 \subseteq A_2$ be the three-cycles that permute three array elements $(i_r, j_r)$, $r = 1, 2, 3$, for which the $i$ coordinates are distinct from each other and the $j$ coordinates are distinct from each other. For $n \ge i, j \ge 3$ let $X_{ij}$ and $Y_{ij}$ be the following products of elements of $A_1$ (commas are introduced for clarity and the notation for products is left-to-right, so that $\pi\sigma$ means first do $\pi$ then $\sigma$):

$$X_{ij} = \pi_{i,j}\,\pi_{i-1,j}\,\pi_{i-2,j}\,\pi_{i-1,j}$$
$$Y_{ij} = X_{i,j}\,X_{i,j-1}\,X_{i,j-2}\,X_{i,j-1}.$$

For $n \ge i \ge 3 > j$, let $X_{ij}$ be defined as above and let $Y_{ij} = X_{ij}$. For $n \ge j \ge 3 > i$, let $X_{ij} = \pi_{ij}$ and let $Y_{ij} = X_{i,j}\,X_{i,j-1}\,X_{i,j-2}\,X_{i,j-1}$ as before. Finally, if $3 > i, j$ let $X_{ij} = Y_{ij} = \pi_{ij}$.

Claim: $Y_{ij}$ is the permutation that transposes the $(i,j)$-element of the array with the top element $T$, and in addition, if $i, j \ge 2$, transposes the $(i,1)$-element with the $(1,j)$-element. The proof of this is omitted, being a case by case verification; the figure illustrates the case $i = j = 5$.

               54 53 52 15 51               55 12 13 14 51
               21 22 23 24 25               21 22 23 24 25
    → (X_53)   31 32 33 34 35    → (X_54)   31 32 33 34 35
               41 42 43 44 45               41 42 43 44 45
               14 13 12 55 11               15 52 53 54 11
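The claim lends itself to a mechanical check. The following is a minimal sketch, assuming the conventions stated above (products are applied left to right, and $\pi_{ij}$ rotates the upper-left $i \times j$ block by 180 degrees). The helper names `apply_pi`, `apply_X`, `apply_Y`, `start_grid` and `expected_Y`, and the labelling of the card in row r and column s by 10r + s, are illustrative choices and not part of the original argument.

```python
def apply_pi(grid, i, j):
    """Rotate the upper-left i x j block by 180 degrees (i, j are 1-indexed sizes)."""
    new = [row[:] for row in grid]
    for r in range(i):
        for s in range(j):
            # card in 0-indexed cell (r, s) moves to cell (i-1-r, j-1-s)
            new[i - 1 - r][j - 1 - s] = grid[r][s]
    return new


def apply_X(grid, i, j):
    """Apply X_ij: four rotations when i >= 3, otherwise the single rotation pi_ij."""
    steps = [(i, j), (i - 1, j), (i - 2, j), (i - 1, j)] if i >= 3 else [(i, j)]
    for a, b in steps:
        grid = apply_pi(grid, a, b)
    return grid


def apply_Y(grid, i, j):
    """Apply Y_ij: four X's when j >= 3, otherwise the single X_ij."""
    cols = [j, j - 1, j - 2, j - 1] if j >= 3 else [j]
    for c in cols:
        grid = apply_X(grid, i, c)
    return grid


def start_grid(n):
    """Card in row r, column s (1-indexed) is labelled 10*r + s; needs n <= 9."""
    return [[10 * (r + 1) + (s + 1) for s in range(n)] for r in range(n)]


def expected_Y(n, i, j):
    """The arrangement predicted by the claim for Y_ij."""
    g = start_grid(n)
    g[0][0], g[i - 1][j - 1] = g[i - 1][j - 1], g[0][0]
    if i >= 2 and j >= 2:
        g[i - 1][0], g[0][j - 1] = g[0][j - 1], g[i - 1][0]
    return g


if __name__ == "__main__":
    for row in apply_Y(start_grid(5), 5, 5):
        print(row)
```

For i = j = 5 the printed rows agree with the right-hand array of the figure, and comparing `apply_Y` with `expected_Y` over all (i, j) in a small array is a quick way to run the omitted case analysis.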



Next, for pairs $(i_1, j_1)$, $(i_2, j_2)$ both unequal to $T$ and satisfying $i_1 \ne i_2$ and $j_1 \ne j_2$, let

$$Z_{i_1,j_1,i_2,j_2} = Y_{i_1,j_1}\,Y_{i_2,j_2}\,Y_{i_1,j_1}\,Y_{i_2,j_2}.$$

It is easy to see that $Z_{i_1,j_1,i_2,j_2}$ is the three-cycle permuting $T$, the $(i_2,j_2)$-element and the $(i_1,j_1)$-element.
Finally, for $i_1, j_1, i_2, j_2, i_3, j_3$ with none of the $i_r$'s equal to another, none of the $j_r$'s equal to another and no pair $(i_r, j_r)$ equal to $(1, 1)$, let

$$W_{i_1,j_1,i_2,j_2,i_3,j_3} = Z_{i_1,j_1,i_2,j_2}\,Z_{i_2,j_2,i_3,j_3}.$$

Then $W_{i_1,j_1,i_2,j_2,i_3,j_3}$ cyclically permutes the $(i_3,j_3)$-element, the $(i_2,j_2)$-element and the $(i_1,j_1)$-element. If $\pi \in A_3$ is a three-cycle that permutes three array elements $(i_3,j_3)$, $(i_2,j_2)$ and $(i_1,j_1)$ with $i_r, j_r \ge 2$, pick the decomposition of $\pi$ into elements of $A_1$ according to the construction of $W_{i_1,j_1,i_2,j_2,i_3,j_3}$. If one of the pairs $(i_r, j_r)$ is equal to $(1, 1)$, then use the appropriate $Z$ instead of $W$. In the obvious notation, $|\pi| = |Z_{i_1,j_1,i_2,j_2}| + |Z_{i_2,j_2,i_3,j_3}| \le 128$. Furthermore, for any $a = \pi_{ij} \in A_1$, the number of $\pi \in A_3$ for which $N(a, \pi) > 0$ is at most $27n^4$, since one of the pairs $(i_r, j_r)$ must satisfy $i \le i_r \le i + 2$ and $j \le j_r \le j + 2$. Thus

$$\sum_{\pi \in A_3} |\pi| \, N(a, \pi) \le 128 \cdot (128 \cdot 27 n^4)$$

for any $a \in A_1$.
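The three-cycle statements for $Z$ and $W$, and the length bound of 128, can be checked by brute force on a small array. The sketch below is a hedged illustration: it assumes the left-to-right product convention and the definitions of $X$, $Y$, $Z$ and $W$ given above, and the names `word_X`, `word_Y`, `word_Z`, `word_W`, `apply_word`, `moved_positions` and `check_small_array` are hypothetical helpers introduced here.

```python
from itertools import permutations


def word_X(i, j):
    """X_ij as a list of index pairs (a, b), each standing for the rotation pi_ab."""
    return [(i, j), (i - 1, j), (i - 2, j), (i - 1, j)] if i >= 3 else [(i, j)]


def word_Y(i, j):
    cols = [j, j - 1, j - 2, j - 1] if j >= 3 else [j]
    return [step for c in cols for step in word_X(i, c)]


def word_Z(a, b):
    """Z = Y_a Y_b Y_a Y_b for positions a = (i1, j1), b = (i2, j2)."""
    return (word_Y(*a) + word_Y(*b)) * 2


def word_W(a, b, c):
    """W = Z_{a,b} Z_{b,c}."""
    return word_Z(a, b) + word_Z(b, c)


def apply_word(n, word):
    """Apply the rotations in order to an n x n array whose cards are labelled by home cell."""
    grid = [[(r, s) for s in range(1, n + 1)] for r in range(1, n + 1)]
    for i, j in word:
        new = [row[:] for row in grid]
        for r in range(i):
            for s in range(j):
                new[i - 1 - r][j - 1 - s] = grid[r][s]
        grid = new
    return grid


def moved_positions(n, word):
    """Cells that no longer hold their original card after the word is applied."""
    grid = apply_word(n, word)
    return {(r, s) for r in range(1, n + 1) for s in range(1, n + 1)
            if grid[r - 1][s - 1] != (r, s)}


def check_small_array(n):
    """Z moves exactly {T, a, b} and W moves exactly {a, b, c} for all admissible triples."""
    cells = [(r, s) for r in range(1, n + 1) for s in range(1, n + 1) if (r, s) != (1, 1)]
    for a, b, c in permutations(cells, 3):
        if len({a[0], b[0], c[0]}) < 3 or len({a[1], b[1], c[1]}) < 3:
            continue  # rows and columns must be pairwise distinct
        if moved_positions(n, word_Z(a, b)) != {(1, 1), a, b}:
            return False
        if moved_positions(n, word_W(a, b, c)) != {a, b, c}:
            return False
    return True
```

A permutation that displaces exactly three cards and fixes all others is necessarily a three-cycle, which is why comparing the set of moved cells suffices here. The word for $W$ has length at most 4 · 16 + 4 · 16 = 128, matching the bound on $|\pi|$.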

For $\pi \in A_2 \setminus A_3$, decompose it into a product of elements of $A_1$ as follows. If $\pi$ permutes the $(i_r, j_r)$-elements for $r = 1, 2, 3$, choose $(u_1, v_1)$ and $(u_2, v_2)$ from among the set $\{(x, y) : \exists r \text{ with } |x - i_r| + |y - j_r| \le 6\}$ in such a way that each $u_s$ is distinct from each $i_r$, each $v_s$ is distinct from each $j_r$, and $u_1 \ne u_2$ and $v_1 \ne v_2$. Writing $a, b, c, d, e$ for $(i_1, j_1)$, $(i_2, j_2)$, $(i_3, j_3)$, $(u_1, v_1)$ and $(u_2, v_2)$ respectively, decompose $\pi$ as



It is easy to check that this does indeed give $\pi$ and that for $\pi \in A_2 \setminus A_3$, the decomposition satisfies $|\pi| \le 640$. Furthermore, the number of $\pi \in A_2 \setminus A_3$ for which $N(\pi_{ij}, \pi) > 0$ is bounded by the number of ways of choosing three array elements in such a way that some two are in the same row or column and one is within a distance 6 of $(i, j)$ in the taxicab metric. This is at most $cn^4$ for some constant $c$.

Applying Lemma 6 with $n \ge 10$ now gives $1 - \lambda_i \le B(1 - \lambda'_i)$ where

$$B = \frac{n^2}{2\binom{n^2}{3}} \max_{a \in A_1} \sum_{\pi \in A_2} |\pi| \, N(a, \pi) \le 3.1\,n^{-4}\left(128 \cdot (128 \cdot 27 n^4) + 640 \cdot (640\,c\,n^4)\right),$$

which is bounded by some constant, proving the lemma for $\mathcal{S}_0$. For $\mathcal{S}$, use the same decompositions, losing a factor of two in $|A_1|/|A_2|$. □
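The numerical prefactor can be checked with exact arithmetic. This is a small sketch under the assumption that, for $\mathcal{S}_0$, $|A_1| = n^2$ (one rotation $\pi_{ij}$ per index pair) and $|A_2| = 2\binom{n^2}{3}$ (the number of three-cycles on $n^2$ cards); the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def generator_ratio(n):
    """|A_1| / |A_2| with |A_1| = n^2 rotations and |A_2| = 2 * C(n^2, 3) three-cycles."""
    num_rotations = n * n
    num_three_cycles = 2 * comb(n * n, 3)
    return Fraction(num_rotations, num_three_cycles)


def ratio_bound_holds(n):
    """Check |A_1| / |A_2| <= 3.1 * n^(-4) exactly, without floating point."""
    return generator_ratio(n) <= Fraction(31, 10) / n**4
```

Under these assumptions the ratio equals $3/((n^2-1)(n^2-2))$, and the inequality with constant 3.1 holds from n = 10 onward but fails at n = 9, which is consistent with the hypothesis $n \ge 10$.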



It has been shown that the eigenvalues of $\mathcal{S}$ are bounded in terms of the eigenvalues of $\mathcal{R}$; the computation of these latter uses a combinatorial formula from [3]. Let $\rho$ be any irreducible matrix representation of $S_n$. Since the measure $\mathcal{R}$ is uniform on conjugacy classes, the matrix $\mathcal{R}(\rho) \stackrel{\mathrm{def}}{=} \mathbf{E}_{\mathcal{R}}(\rho)$ will be a constant multiple of the identity, the constant being $\chi_\rho(\tau)/d(\rho)$, where $\chi_\rho$ is the character of the representation $\rho$ and $\tau$ is any element of the conjugacy class, in other words, any three-cycle. This gives $d(\rho)$ eigenvalues equal to $\chi_\rho(\tau)/d(\rho)$ in the irreducible representation $\rho$, and since this representation appears with multiplicity $d(\rho)$, the shuffle $\mathcal{R}$ will have this eigenvalue with multiplicity $d(\rho)^2$. Ingram's formula for the characters of the irreducible representations of $S_n$ evaluated at a three-cycle yields the following upper bounds:

Lemma 8 Let $\rho$ be the irreducible representation of $S_n$ corresponding to the partition $t = (t_1 \ge t_2 \ge \cdots)$ of $n$. Then the character of $\rho$ evaluated at a three-cycle is given by

$$r(\rho) \stackrel{\mathrm{def}}{=} \chi_\rho(\tau)/d(\rho) = \frac{3\sum_{(i,j)} (j-i)^2}{n(n-1)(n-2)} - \frac{3}{2(n-2)} \qquad (8)$$

where the sum is over all $(i, j)$ such that $t_i \ge j$, or in other words over all squares of the Young tableau for the partition $t$. It follows from this that

$$r(\rho) \le 1 - \frac{3(t_1 - 1)(n - t_1)}{(n-1)(n-2)} \quad \text{when } t_1 \ge n/2$$

and

$$r(\rho) \le \max\{t_1 - 1,\, t'_1 - 1\}/(n - 2) \quad \text{when } t_1, t'_1 \le n/2,$$

where $t'_1 = \max\{i : t_i > 0\}$ is the first element of the partition dual to $t$.

Proof: The formula (8) is taken directly from [3, (5.2)], where the term $a(a+1)(2a+1)$ is replaced by $6\sum_{i=1}^{a} i^2$ and the typographical error (a misplaced parenthesis) is corrected. For fixed $t_1 \ge n/2$, the sum is maximized by letting $t_2 = \cdots = t_{n+1-t_1} = 1$ and $t_i = 0$ for $i > n + 1 - t_1$. For the trivial representation, $t = (n, 0, 0, \ldots)$ and $r = 1$. Comparing (8) for the trivial representation and a nontrivial representation $\rho$ gives

$$\begin{aligned}
1 - r(\rho) &\ge \frac{3}{n(n-1)(n-2)} \sum_{k=1}^{n-t_1} \left((t_1 - 1 + k)^2 - k^2\right) \\
&= \frac{3}{n(n-1)(n-2)} \sum_{k=1}^{n-t_1} \left((t_1 - 1)^2 + 2k(t_1 - 1)\right) \\
&= \frac{3}{n(n-1)(n-2)} \left[(n - t_1)(t_1 - 1)^2 + (n - t_1)(n - t_1 + 1)(t_1 - 1)\right] \\
&= \frac{3}{(n-1)(n-2)} \left[(n - t_1)(t_1 - 1)\right].
\end{aligned}$$



On the other hand, when $t_1, t'_1 \le n/2$, then let $t_0 = \max\{t_1, t'_1\}$. Ignore the subtracted term in (8) to get

$$r(\rho) \le \frac{3\sum_{(i,j)} (j-i)^2}{n(n-1)(n-2)}.$$

Partition the $n$ pairs $(i, j)$ according to the value of $i$ and observe that for any $i$, the average of the summands with that particular value of $i$ is

$$t_i^{-1} \sum_{j=1}^{t_i} (j - i)^2 \le t_0^{-1} \sum_{j=1}^{t_0} (j - 1)^2 = (t_0 - 1)(2t_0 - 1)/6.$$

This is then an upper bound for the average of all the summands; the sum is precisely $n$ times the average, yielding

$$r(\rho) \le \frac{(t_0 - 1)(2t_0 - 1)}{2(n-1)(n-2)} \le \frac{t_0 - 1}{2(n-2)}.$$



□ 
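Both bounds of Lemma 8 can be spot-checked over all partitions of small n. The sketch below assumes that (8) reads $r(\rho) = 3\sum (j-i)^2 / (n(n-1)(n-2)) - 3/(2(n-2))$, with the sum over the squares $(i, j)$ of the diagram; the helper names `partitions`, `char_ratio` and `lemma8_bounds_hold` are hypothetical, and exact rational arithmetic is used so that the equality cases (hook shapes) are handled correctly.

```python
from fractions import Fraction


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def char_ratio(t):
    """r(rho) for the partition t, from the assumed form of (8); requires n >= 3."""
    n = sum(t)
    squares = sum((j - i) ** 2
                  for i, row in enumerate(t, start=1)
                  for j in range(1, row + 1))
    return Fraction(3 * squares, n * (n - 1) * (n - 2)) - Fraction(3, 2 * (n - 2))


def lemma8_bounds_hold(t):
    """Check whichever of the two bounds of Lemma 8 applies to the partition t."""
    n, t1, t1_dual = sum(t), t[0], len(t)
    r = char_ratio(t)
    ok = True
    if 2 * t1 >= n:  # first bound, stated for t_1 >= n/2
        ok = ok and r <= 1 - Fraction(3 * (t1 - 1) * (n - t1), (n - 1) * (n - 2))
    if 2 * t1 <= n and 2 * t1_dual <= n:  # second bound, for t_1, t'_1 <= n/2
        ok = ok and r <= Fraction(max(t1, t1_dual) - 1, n - 2)
    return ok
```

For example, `char_ratio((2, 2))` returns -1/2, in agreement with the character value -1 and dimension 2 of that representation of $S_4$, and the trivial partition gives 1.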



The bound (9) below on the time to randomization for the shuffle $\mathcal{S}$ in terms of its eigenvalues is based on the Upper Bound Lemma (3B.1) from [1]; the evaluation of (9) is based on the analogous computation for random transpositions on pages 41-42 of [1]. Accordingly, some details are omitted here.

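Proof of Theorem 2: Let the eigenvalues of $\mathcal{R}$ and $\mathcal{S}$ be denoted respectively by $\lambda_i$ and $\lambda'_i$, listed in the following order: $\lambda_1 = \lambda'_1 = 1$ are the eigenvalues on $V_+$; $\lambda_2, \lambda'_2$ are the eigenvalues on $V_-$, with $\lambda_2 = 1 > \lambda'_2$; $\lambda_3 \ge \cdots \ge \lambda_{n!}$ and $\lambda'_3 \ge \cdots \ge \lambda'_{n!}$ are the eigenvalues on $V_\perp$. Using the constant $c$ from Lemma 7 and Lemma 3B.1 of [1] gives

$$\begin{aligned}
4\,\|\mathcal{S}^{ct} - U\|^2 &\le n! \sum_{\pi} |\mathcal{S}^{ct}(\pi) - U(\pi)|^2 \\
&= \sum_{i \ge 2} e^{-2ct(1 - \lambda'_i)} \\
&\le e^{-2ct(1 - \lambda'_2)} + \sum_{i \ge 3} e^{-2t(1 - \lambda_i)} \\
&= e^{-2ct(1 - \lambda'_2)} + \sum_{\rho}{}^{*}\, d(\rho)^2 \exp[-2t(1 - r(\rho))], \qquad (9)
\end{aligned}$$

where $\sum^{*}$ denotes a sum over representations $\rho$ other than the trivial representation and the alternating representation.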
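For small n the sum over representations in (9) can be evaluated directly. The following sketch makes two assumptions that should be read as such: the dimension $d(\rho)$ is computed with the standard hook length formula, and $r(\rho)$ is taken in the form $3\sum (j-i)^2/(n(n-1)(n-2)) - 3/(2(n-2))$ with the sum over squares of the diagram. The names `partitions`, `dimension`, `char_ratio` and `tail_sum` are illustrative.

```python
from fractions import Fraction
from math import exp, factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dimension(t):
    """d(rho) by the hook length formula: n! divided by the product of hook lengths."""
    dual = [sum(1 for row in t if row > j) for j in range(t[0])]
    hooks = 1
    for i, row in enumerate(t):
        for j in range(row):
            arm = row - j - 1        # squares to the right in the same row
            leg = dual[j] - i - 1    # squares below in the same column
            hooks *= arm + leg + 1
    return factorial(sum(t)) // hooks


def char_ratio(t):
    """Assumed form of r(rho) at a three-cycle; requires n >= 3."""
    n = sum(t)
    squares = sum((j - i) ** 2
                  for i, row in enumerate(t, start=1)
                  for j in range(1, row + 1))
    return Fraction(3 * squares, n * (n - 1) * (n - 2)) - Fraction(3, 2 * (n - 2))


def tail_sum(n, time):
    """Sum of d(rho)^2 exp[-2 time (1 - r(rho))], omitting trivial and alternating."""
    total = 0.0
    for t in partitions(n):
        if t == (n,) or t == (1,) * n:
            continue
        total += dimension(t) ** 2 * exp(-2.0 * time * (1.0 - float(char_ratio(t))))
    return total
```

At time 0 the sum equals n! - 2, since the squared dimensions of all irreducible representations add up to n!; it then decreases in the time parameter because every representation other than the two omitted ones has r strictly below 1.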

We now bound (9) using Lemma 8. First dispose of the $e^{-2ct(1 - \lambda'_2)}$ term. Since the alternating character is $\sum_{\sigma} \mathrm{sign}(\sigma)\,\mathcal{S}(\sigma)$ and the sign of $\pi_{ij}$ is negative when (among other cases) $i$ is odd and $j \equiv 2 \pmod 4$, the alternating character is at most 3/4, and

$$e^{-2ct(1 - \lambda'_2)} \le e^{-ct/2}.$$
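The bound of 3/4 on the alternating character can be illustrated numerically. The sketch below averages the sign of $\pi_{ij}$ over a uniformly chosen index pair $(i, j)$, which covers only the upper-left rotations; this restriction is an assumption of the example, although by symmetry a lower-right rotation has the same cycle type as an upper-left rotation of a block of the same dimensions. The function names are illustrative.

```python
from fractions import Fraction


def rotation_permutation(n, i, j):
    """pi_ij as a dict sending each cell of the n x n array to its image."""
    perm = {}
    for r in range(1, n + 1):
        for s in range(1, n + 1):
            inside = r <= i and s <= j
            perm[(r, s)] = (i + 1 - r, j + 1 - s) if inside else (r, s)
    return perm


def sign(perm):
    """Sign of a permutation: each cycle of even length contributes a factor -1."""
    seen, result = set(), 1
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        if length % 2 == 0:
            result = -result
    return result


def alternating_character(n):
    """Average sign of pi_ij over the n^2 index pairs (i, j)."""
    total = sum(sign(rotation_permutation(n, i, j))
                for i in range(1, n + 1) for j in range(1, n + 1))
    return Fraction(total, n * n)
```

A 180 degree rotation of an $i \times j$ block is a product of $\lfloor ij/2 \rfloor$ disjoint transpositions, so its sign is negative exactly when that count is odd; the case of odd $i$ and $j \equiv 2 \pmod 4$ named in the text is one instance.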



For the remaining sum, observe that if $\rho$ and $\rho'$ correspond to dual partitions $t, t'$ then $d(\rho) = d(\rho')$ and $r(\rho) = r(\rho')$. Since the trivial and alternating partitions are dual, this gives

$$\sum_{\rho}{}^{*}\, d(\rho)^2 \exp[-2t(1 - r(\rho))] \le 2 \sum_{\rho}{}^{**}\, d(\rho)^2 \exp[-2t(1 - r(\rho))]$$

where $\sum^{**}$ is over nontrivial partitions with $t_1 \ge t'_1$. Note that for $t \ge n/2$,



^ 3ft-l)(n-t) ^ ^ 3ft-l) / 

n{n — 1) n — 2 \ n — 1 J 



< 



n- 2 

and thus for any $\alpha \in (0, 1/2)$, the above expression involving $\sum^{**}$ is at most

p:ti>(l-a)n ^ ' p:tx<(\-<x)n 

Diaconis now shows [1, proof of Theorem 5, page 42] that $\alpha \in (0, 1/4)$ may be chosen so that when
t > (l/2)nln(n) + kn, both sums together are less than ae"^*^ for some universal constant a. This shows 
that $\|\mathcal{S}^{ct} - U\|$ goes to zero when $t = (.5 + \epsilon)\,n \ln(n)$, proving Theorem 2. □